OPTIMAL SHIFT INVARIANT SPACES AND THEIR PARSEVAL FRAME GENERATORS



AKRAM ALDROUBI, CARLOS CABRELLI, DOUGLAS HARDIN, AND URSULA MOLTER 



Abstract. Given a set of functions $\mathcal{F} = \{f_1, \dots, f_m\} \subset L^2(\mathbb{R}^d)$, we study the
problem of finding the shift-invariant space $V$ with $n$ generators $\{\varphi_1, \dots, \varphi_n\}$
that is "closest" to the functions of $\mathcal{F}$ in the sense that

\[ V = \operatorname*{argmin}_{V' \in \mathcal{V}_n} \sum_{i=1}^{m} w_i \|f_i - P_{V'} f_i\|^2, \]

where the $w_i$ are positive weights, and $\mathcal{V}_n$ is the set of all shift-invariant spaces
that can be generated by $n$ or fewer generators. The Eckart-Young Theorem uses
the singular value decomposition to provide a solution to a related problem
in finite dimension. We transform the problem under study into an uncountable
set of finite dimensional problems, each of which can be solved using an
extension of the Eckart-Young Theorem. We prove that the finite dimensional
solutions can be patched together and transformed to obtain the optimal
shift-invariant space solution to the original problem, and we produce a Parseval
frame for the optimal space. A typical application is the problem of finding
a shift-invariant space model that describes a given class of signals or images
(e.g., the class of chest X-rays), from the observation of a set of $m$ signals or
images $f_1, \dots, f_m$, which may be theoretical samples or experimental data.



1. Introduction 

In many signal and image processing applications, images and signals are assumed
to belong to some shift-invariant space of the form

\[ S(\Phi) := \operatorname{closure}_{L^2} \operatorname{span}\{\varphi_i(x - k) : i = 1, \dots, n,\ k \in \mathbb{Z}^d\} \tag{1.1} \]

where $\Phi = \{\varphi_1, \dots, \varphi_n\}$ is a set of functions in $L^2(\mathbb{R}^d)$. The functions $\varphi_1, \varphi_2, \dots, \varphi_n$
are called a set of generators for the space $S = S(\Phi) := S(\varphi_1, \dots, \varphi_n)$, and any such
space $S$ is called a finitely generated shift-invariant space (FSIS) (see e.g., [BDR94a],
[AG01]). For example, if $n = 1$, $d = 1$ and $\varphi(x) = \operatorname{sinc}(x)$, then the underlying
space is the space of band-limited functions (often used in communications).

Finitely generated shift-invariant spaces can have different sets of generators.
The length of an FSIS $S$ is

\[ \mathcal{L}(S) = \min\{\ell \in \mathbb{N} : \exists\, \varphi_1, \dots, \varphi_\ell \in S \text{ with } S = S(\varphi_1, \dots, \varphi_\ell)\}. \]

We will denote by $\mathcal{V}_n$ the set of all shift-invariant spaces with length less
than or equal to $n$. That is, an element in $\mathcal{V}_n$ is a shift-invariant space that has a
set of $s$ generators with $s \le n$.

In most applications, the shift-invariant space chosen to describe the underlying
class of signals is not derived from experimental data. For example, many signal
processing applications assume "band-limitedness" of the signal, which has theoretical
advantages but does not necessarily reflect the underlying class of signals
accurately. Furthermore, in applications, the a priori hypothesis that the class of
signals belongs to a shift-invariant space with a known number of generators, may 
not be satisfied. For example, the class of functions from which the data is drawn 
may not be a shift-invariant space. Another example is when the shift-invariant
space hypothesis is correct but the assumption about the number of generators is
wrong. A third example is when the a priori hypothesis is correct but the data is
corrupted by noise. In addition, for computational considerations, a shift-invariant 
space of length m could be modeled by a shift-invariant model space with length n 
much smaller than m. For example, in Learning Theory, the problem of reducing 
the number of generators for a subspace of a reproducing kernel Hilbert space is 
also important for improving the efficiency and sparsity of learning algorithms (see
[SZ04]). In order to model classes of signals or images by FSIS in realistic cases,
or to model a very large data set by a computationally manageable shift-invariant 
space, we consider the following problem: 

Problem 1. Given a large set of experimental data $\mathcal{F} = \{f_1, f_2, \dots, f_m\} \subset L^2(\mathbb{R}^d)$,
we wish to determine a shift-invariant space $V \in \mathcal{V}_n$ (where typically $n$ is chosen
to be small compared to $m$) that models the signals in "some" best way. For this
purpose, we consider the following least squares problem:

\[ \sum_{i=1}^{m} w_i \|f_i - P_V f_i\|^2 \le \sum_{i=1}^{m} w_i \|f_i - P_{V'} f_i\|^2, \quad \forall\, V' \in \mathcal{V}_n, \tag{1.2} \]

where the $w_i$ are positive weights and where $P_{V'}$ is the orthogonal projection onto $V'$.

A space $V$ satisfying (1.2) will be said to solve Problem 1 for $(\mathcal{F}, w, n)$.

The weights $w_i$ can be chosen to normalize or to reflect our confidence about
the data. For example, we can choose $w_i = \|f_i\|^{-2}$ to place the data on a sphere, or
we can choose a small weight $w_i$ for a given $f_i$ if, due to noise or other factors,
our confidence about the accuracy of $f_i$ is low. The goal is to see if we can perform
operations on the observed data $\mathcal{F} = \{f_1, f_2, \dots, f_m\}$ to construct (if it exists)
a shift-invariant space $S(\Phi)$ whose length does not exceed a small number $n$ and that
minimizes the error with our data $\mathcal{F}$.

Problem 1 can be viewed as a non-linear, infinite dimensional, constrained
minimization problem. It may also be viewed in light of the recent learning theory
developed in [BCDV05], [CS02], [SZ03], and estimates of model fit in terms of noise
and approximation space may be derived. Besides the fundamental question of
existence of an optimal space, it will be important for applications to have a way to
construct the generators of the optimal space if it exists, and to estimate the error

\[ \mathcal{E}(\mathcal{F}, w, n) = \sum_{i=1}^{m} w_i \|f_i - P_V f_i\|^2, \]

where $V \in \mathcal{V}_n$ is an optimal space for $\mathcal{F}$, $w$ and $n$.



Typical applications involve large data sets (for example consider the problem 
of finding a shift-invariant space model for the collection of chest X-rays using data 
collected by a hospital during the last 10 years). The space $S(\mathcal{F})$ generated by a
set of experimental data contains all the data as possible signals, but it is too large
to be an appropriate model for use in applications. A space with a "small" number
of generators is more suitable, since if the space is chosen correctly, it would reduce
noise, and would give a computationally manageable model for a given application. 

Least squares problems of the form above in finite dimensional spaces can be
solved using the singular value decomposition (SVD). Shift-invariant spaces are
infinite dimensional and the SVD cannot be applied directly. However, due to the
special structure of shift-invariant spaces, the Fourier transform converts Problem 1
into finite dimensional least squares problems at each frequency, as will be discussed
in Section 4.

2. Main Theorems 

In this paper we will sometimes deal with the standard Hilbert space $\mathbb{C}^N$. Elements
of this vector space are column vectors with $N$ coordinates. We will use the
notation $A^t$ and $A^*$ to denote the transpose and the conjugate transpose, respectively,
of a complex matrix $A$. We will say that a vector $y \in \mathbb{C}^N$ is a left eigenvector
of the matrix $A$ associated to the eigenvalue $\lambda$ if $y^t A = \lambda y^t$.

For clarity in the exposition, we will consider the unweighted case ($w_i = 1$, $i =
1, \dots, m$). The general case can be derived by simply applying the results of the
unweighted case to the set of normalized observations $\{w_1^{1/2} f_1, \dots, w_m^{1/2} f_m\}$.
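
The reduction to the unweighted case rests only on the linearity of the orthogonal projection. As a short check (the symbol $g_i$ is introduced here purely for illustration), for any closed subspace $V'$ and positive weights $w_i$,

\[ \sum_{i=1}^{m} w_i \|f_i - P_{V'} f_i\|^2 = \sum_{i=1}^{m} \|w_i^{1/2} f_i - P_{V'}(w_i^{1/2} f_i)\|^2 = \sum_{i=1}^{m} \|g_i - P_{V'} g_i\|^2, \qquad g_i := w_i^{1/2} f_i, \]

so a space minimizes the weighted functional for $\{f_i\}$ over $\mathcal{V}_n$ exactly when it minimizes the unweighted functional for the rescaled data $\{g_i\}$.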

The first theorem establishes the existence of an optimal space $V$. It also
establishes that $V$ can always be chosen to be a subspace of the shift-invariant space
$S(\mathcal{F})$ generated by the totality of the data. This optimal space $V$ may not be
unique. However, under additional assumptions that are often satisfied in practice,
there is only one optimal space $V$, as stated in Theorem 2.4.

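Theorem 2.1. Let $\mathcal{F} = \{f_1, \dots, f_m\}$ be a set of functions in $L^2(\mathbb{R}^d)$. Then

(1) There exists $V \in \mathcal{V}_n$ such that

\[ \sum_{i=1}^{m} \|f_i - P_V f_i\|^2 \le \sum_{i=1}^{m} \|f_i - P_{V'} f_i\|^2, \quad \forall\, V' \in \mathcal{V}_n. \tag{2.1} \]

(2) The optimal shift-invariant space $V$ in (2.1) can be chosen such that $V \subset S(\mathcal{F})$.

Remarks.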

(i) Although we do not make the assumption that $n \le m$, if $n \ge m$, then $S(\mathcal{F})$
is an optimal space that belongs to $\mathcal{V}_n$. Thus, we will always assume that
$n \le m$ for the remainder of this paper.

(ii) In practice it will often be the case that $n$ is chosen (or found) to be much
smaller than $m$.

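We still need to explicitly find an optimal space $V$ and estimate the error

\[ \mathcal{E}(\mathcal{F}, n) = \min_{V' \in \mathcal{V}_n} \sum_{i=1}^{m} \|f_i - P_{V'} f_i\|^2. \tag{2.2} \]

To compute the error $\mathcal{E}(\mathcal{F}, n)$ we need to consider the Gramian matrix $G_{\mathcal{F}}$ of $\mathcal{F} =
\{f_1, \dots, f_m\}$. Specifically, the Gramian $G_\Phi$ of a set of functions $\Phi = \{\varphi_1, \dots, \varphi_n\}$
with elements in $L^2(\mathbb{R}^d)$ is defined to be the $n \times n$ matrix of $\mathbb{Z}^d$-periodic functions

\[ [G_\Phi(\omega)]_{i,j} = \sum_{k \in \mathbb{Z}^d} \hat{\varphi}_i(\omega + k)\, \overline{\hat{\varphi}_j(\omega + k)}, \quad \omega \in \mathbb{R}^d, \tag{2.3} \]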
where $\hat{\varphi}_i$ denotes the Fourier transform of $\varphi_i$, and where $\overline{\hat{\varphi}_i}$ denotes the complex
conjugate of $\hat{\varphi}_i$. It is known that $G_\Phi$ is $\mathbb{Z}^d$-periodic, and non-negative and self-adjoint
for almost every $\omega$. In this paper, we use the following definition for the Fourier
transform of a function $\phi \in L^2(\mathbb{R}^d)$:

\[ \hat{\phi}(\omega) := \int_{\mathbb{R}^d} \phi(x)\, e^{-i 2\pi \omega \cdot x}\, dx, \quad \omega \in \mathbb{R}^d, \tag{2.4} \]

where $dx$ denotes Lebesgue measure on $\mathbb{R}^d$.
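
As an illustration of the Gramian under the convention (2.4), consider $d = 1$ and the single generator $\varphi(x) = \operatorname{sinc}(x)$, assuming the normalized definition $\operatorname{sinc}(x) = \sin(\pi x)/(\pi x)$ (the text does not fix the normalization, so this is an assumption). Then $\hat{\varphi} = \chi_{[-1/2, 1/2]}$ and the $1 \times 1$ Gramian is

\[ G_\Phi(\omega) = \sum_{k \in \mathbb{Z}} |\chi_{[-1/2,1/2]}(\omega + k)|^2 = 1 \quad \text{for a.e. } \omega \in \mathbb{R}, \]

because almost every $\omega$ has exactly one integer translate in $[-1/2, 1/2]$. A Gramian equal to the identity almost everywhere corresponds to integer translates of the generators forming an orthonormal system.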

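If $\mathcal{V} = \{v_1, \dots, v_n\}$ is a set of vectors in a Hilbert space $\mathcal{H}$, we will denote by
$\mathfrak{G}(\mathcal{V}) = \mathfrak{G}(v_1, \dots, v_n)$ the matrix

\[ [\mathfrak{G}(\mathcal{V})]_{i,j} = \langle v_i, v_j \rangle_{\mathcal{H}}, \quad i, j = 1, \dots, n. \tag{2.5} \]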

Our next theorem produces a generator for an optimal space V and provides a 
formula for the exact value of the error, but we first recall the definition and some 
properties of frames used in its statement (see for example |( X iHniOl IHLW02j ) . 

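Definition 2.2. Let $\mathcal{H}$ be a Hilbert space and $\{u_i\}_{i \in I}$ a countable subset of $\mathcal{H}$.
The set $\{u_i\}_{i \in I}$ is said to form a frame for $\mathcal{H}$ if there exist $q, Q > 0$ such that

\[ q \|f\|^2 \le \sum_{i \in I} |\langle f, u_i \rangle|^2 \le Q \|f\|^2, \quad \forall f \in \mathcal{H}. \]

If $q = Q$, then $\{u_i\}_{i \in I}$ is called a tight frame, and it is called a Parseval frame if
$q = Q = 1$.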

If $\{u_i\}_{i \in I}$ is a Parseval frame for a subspace $W$ of a Hilbert space $\mathcal{H}$, and if
$a \in \mathcal{H}$, then the orthogonal projection of $a$ onto $W$ is given by:

\[ P_W(a) = \sum_{i \in I} \langle a, u_i \rangle u_i. \tag{2.6} \]

Thus, a Parseval frame acts as if it were an orthonormal basis of $W$, even though
it may not be one.
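
Formula (2.6) can be checked numerically on a small real example. The sketch below is only an illustration under stated assumptions: the three vectors (a scaled "Mercedes-Benz" frame placed in the plane $z = 0$ of $\mathbb{R}^3$) and the helper names `inner`, `FRAME` and `project` are chosen here and do not come from the paper. The vectors are linearly dependent, so they are not a basis of $W$, yet the expansion still reproduces the orthogonal projection.

```python
import math


def inner(a, b):
    """Real inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Three unit directions at 120 degrees, scaled by sqrt(2/3), inside W = {z = 0}.
# With this scaling, sum_i <w, u_i>^2 = ||w||^2 for every w in W (Parseval frame).
SCALE = math.sqrt(2.0 / 3.0)
FRAME = [
    (SCALE * math.cos(2 * math.pi * k / 3), SCALE * math.sin(2 * math.pi * k / 3), 0.0)
    for k in range(3)
]


def project(a, frame=FRAME):
    """Formula (2.6): P_W(a) = sum_i <a, u_i> u_i for a Parseval frame {u_i} of W."""
    out = [0.0] * len(a)
    for u in frame:
        c = inner(a, u)
        out = [o + c * ui for o, ui in zip(out, u)]
    return out


a = (1.0, 2.0, 3.0)
p = project(a)  # numerically close to (1, 2, 0), the projection of a onto the plane z = 0
```

The same expansion with an unscaled (merely tight) frame would return a multiple of the projection, which is why the Parseval normalization matters in (2.6).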

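Theorem 2.3. Under the same assumptions as in Theorem 2.1, let $\lambda_1(\omega) \ge
\lambda_2(\omega) \ge \dots \ge \lambda_m(\omega)$ be the eigenvalues of the Gramian $G_{\mathcal{F}}(\omega)$. Then

(1) The eigenvalues $\lambda_i(\omega)$, $1 \le i \le m$, are $\mathbb{Z}^d$-periodic, measurable functions in
$L^1([0,1]^d)$ and

\[ \mathcal{E}(\mathcal{F}, n) = \sum_{i=n+1}^{m} \int_{[0,1]^d} \lambda_i(\omega)\, d\omega. \tag{2.7} \]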

(2) Let $E_i := \{\omega : \lambda_i(\omega) \neq 0\}$, and define $\sigma_i(\omega) = \lambda_i^{-1/2}(\omega)$ on $E_i$ and
$\sigma_i(\omega) = 0$ on $E_i^c$. Then, there exists a choice of measurable left eigenvectors
$y_1(\omega), \dots, y_n(\omega)$ with $y_i = (y_{i1}, \dots, y_{im})^t$, $i = 1, \dots, n$, associated with the
first $n$ largest eigenvalues of $G_{\mathcal{F}}(\omega)$ such that the functions defined by

\[ \hat{\varphi}_i(\omega) = \sigma_i(\omega) \sum_{j=1}^{m} y_{ij}(\omega)\, \hat{f}_j(\omega), \quad i = 1, \dots, n,\ \omega \in \mathbb{R}^d, \]

are in $L^2(\mathbb{R}^d)$. Furthermore, the corresponding set of functions $\Phi = \{\varphi_1, \dots, \varphi_n\}$
is a set of generators for an optimal space $V$ and the set $\{\varphi_i(\cdot - k) : k \in \mathbb{Z}^d, i =
1, \dots, n\}$ is a Parseval frame for $V$.

The following example shows that the optimal space $V$ does not need to be
unique. Let $m = 2$, $n = 1$, and let $f_1, f_2$ be two functions whose integer translates
together form an orthonormal set. For this
situation, $G_{\mathcal{F}}(\omega)$ is the $2 \times 2$ identity matrix for almost all $\omega \in \mathbb{R}^d$. It follows that
any function $\varphi = c_1 f_1 + c_2 f_2$ with $c = (c_1, c_2)$ a unit vector in $\mathbb{C}^2$ generates an
optimal space and $\mathcal{E}(\mathcal{F}, 1) = 1$. Obviously, in this particular case there are infinitely
many optimal spaces. However, under some mild assumptions, there exists a unique
optimal space $V$ as described in the following theorem:

Theorem 2.4. Let $\mathcal{F} = \{f_1, \dots, f_m\}$ be given functions in $L^2(\mathbb{R}^d)$. If $\lambda_n(\omega) >
\lambda_{n+1}(\omega)$ for almost all $\omega$, then the optimal space $V$ in (2.1) is unique. In this case,
$n \le r_{\min} = \min \operatorname{rank} G_{\mathcal{F}}(\omega)$ and the set $\{\varphi_i(\cdot - k) : k \in \mathbb{Z}^d, i = 1, \dots, n\}$ in part
(2) of Theorem 2.3 is an orthonormal basis for $V$.

Remarks. 

(i) In case that $n = \mathcal{L}(S(f_1, \dots, f_m))$, Theorem 2.3 gives a proof of the known
result that every FSIS has a set of generators forming a Parseval frame.

(ii) It will be clear from the proofs of Theorems 2.1 and 2.3 that the optimal
space $V$ can be decomposed as $V = S(\varphi_1) \oplus \dots \oplus S(\varphi_\ell)$ where $\ell = \mathcal{L}(V)$,
the direct sum is orthogonal and each $\varphi_i$ is a Parseval frame generator of
$S(\varphi_i)$.

(iii) Theorem 2.4 can only be used when $n < m$. When $n = m$ then $S(\mathcal{F})$ is an
optimal space and it is the unique optimal space if and only if $\mathcal{L}(S(\mathcal{F})) = m$.

(iv) Obviously, if $n = m$ then the error between the model and the observation
is null. However, by plotting the error in (2.7) in terms of the number of
generators, an optimal number $n$ may be heuristically derived. Alternatively,
one may choose $n$ so that a cost functional (depending on the error
and on $n$) is optimized as in other dimension reduction schemes.
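
The heuristic in Remark (iv) can be sketched as follows. This is only one possible selection rule, not a procedure given in the paper: the function names `tail_errors` and `choose_n`, the relative tolerance `rel_tol`, and the sample numbers are all assumptions made for illustration. The input is assumed to be the list of integrals $\int_{[0,1]^d} \lambda_i(\omega)\,d\omega$ in non-increasing order, so that the tail sums are the errors given by (2.7).

```python
def tail_errors(eigen_integrals):
    """errors[n] = sum over i > n of the integral of lambda_i, as in (2.7).

    eigen_integrals[i - 1] is assumed to hold the integral of lambda_i over
    [0, 1]^d, listed in non-increasing order.
    """
    m = len(eigen_integrals)
    errors = [0.0] * (m + 1)
    for n in range(m - 1, -1, -1):
        errors[n] = errors[n + 1] + eigen_integrals[n]
    return errors


def choose_n(eigen_integrals, rel_tol):
    """Smallest n whose error is at most rel_tol times the total energy errors[0]."""
    errors = tail_errors(eigen_integrals)
    for n, err in enumerate(errors):
        if err <= rel_tol * errors[0]:
            return n
    return len(eigen_integrals)


# Hypothetical eigenvalue integrals for m = 5 observations.
sample = [5.0, 3.0, 1.0, 0.5, 0.5]
errors = tail_errors(sample)   # error as a function of the number of generators
n_star = choose_n(sample, 0.15)
```

Plotting `errors` against $n$ gives the curve mentioned in the remark; a penalized cost such as error plus a multiple of $n$ would be an equally reasonable alternative rule.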

3. Preliminaries on Finitely Generated Shift-Invariant Spaces 

In this section we state some known results about finitely generated shift-invariant
spaces that we will need later. (See for example [Hel64], [BDR94b], [BDR94a], [RS95],
[Bow00].)

We need first to introduce some definitions.

Given $f \in L^2(\mathbb{R}^d)$ and $\omega \in \mathbb{R}^d$, the fiber of $f$ at $\omega$ is the sequence $\tau_\omega f := \{\hat{f}(\omega + k) :
k \in \mathbb{Z}^d\}$.

If $V$ is a FSIS (recall (1.1)) and $\omega \in [0, 1]^d$ we set $V_\omega := \{\tau_\omega f : f \in V\}$,
the fiber space associated to $V$ and $\omega$.

If $M$ is a closed subspace of a Hilbert space $\mathcal{H}$, throughout this article we will
denote by $P_M$ the orthogonal projection operator in $\mathcal{H}$ onto $M$.

With this notation we have:

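Lemma 3.1. If $f \in L^2(\mathbb{R}^d)$, then

(1) The sequence $(\tau_\omega f)_k = (\hat{f}(\omega + k))_k$ is a well-defined sequence in $\ell^2(\mathbb{Z}^d)$ a.e.
$\omega \in \mathbb{R}^d$; and

(2) $\|\tau_\omega f\|_{\ell^2}$ is a measurable function of $\omega$ and $\|f\|^2 = \|\hat{f}\|^2 = \int_{[0,1]^d} \|\tau_\omega f\|_{\ell^2}^2\, d\omega$.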

Lemma 3.2. Let $V$ be a FSIS in $L^2(\mathbb{R}^d)$. Then we have:

i. $V_\omega$ is a closed subspace of $\ell^2(\mathbb{Z}^d)$ for almost all $\omega \in [0, 1]^d$.
ii. $V = \{f \in L^2(\mathbb{R}^d) : \tau_\omega f \in V_\omega \text{ for almost all } \omega \in [0, 1]^d\}$.
iii. For each $f \in L^2(\mathbb{R}^d)$ we have that $\|\tau_\omega(P_V f)\|_{\ell^2}$ is a measurable function
of the variable $\omega$ and $\tau_\omega(P_V f) = P_{V_\omega}(\tau_\omega f)$.
iv. Let $\varphi_1, \dots, \varphi_r \in L^2(\mathbb{R}^d)$. We have that
a. $\{\varphi_1, \dots, \varphi_r\}$ is a set of generators of $V$ if and only if the fibers
$\tau_\omega \varphi_1, \dots, \tau_\omega \varphi_r$ span $V_\omega$ for almost all $\omega \in [0, 1]^d$;
b. the integer translates of $\varphi_1, \dots, \varphi_r$ are a frame of $V$ if and only if
$\tau_\omega \varphi_1, \dots, \tau_\omega \varphi_r$ are a frame of $V_\omega$ with the same frame bounds, for
almost all $\omega \in [0, 1]^d$.

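Lemma 3.3. Let $\mathcal{F} = \{f_1, \dots, f_m\}$ be functions in $L^2(\mathbb{R}^d)$ and let $A(\omega)$ be the
infinite matrix $A_{kj}(\omega) = (\tau_\omega f_j)(k) = \hat{f}_j(\omega + k)$, $j = 1, \dots, m$, $k \in \mathbb{Z}^d$, and $\omega \in \mathbb{R}^d$.
Then $G_{\mathcal{F}}(\omega) = A^*(\omega) A(\omega)$, and $\operatorname{rank} G_{\mathcal{F}}(\omega) = \operatorname{rank} A(\omega) = \operatorname{rank} A^*(\omega)$, a.e. $\omega \in
\mathbb{R}^d$. In particular, $G_{\mathcal{F}}(\omega) = \mathfrak{G}(\tau_\omega f_1, \dots, \tau_\omega f_m)$.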

4. Proofs 

To prove the theorems in Section 2 we proceed in several steps. First we reduce
the optimization problem into an uncountable set of finite dimensional problems in
the Hilbert space $\mathcal{H} = \ell^2(\mathbb{Z}^d)$. We then apply the Eckart-Young Theorem to prove
that the reduced problems have solutions. Finally, we construct the generators of
the optimal space by patching together the solutions of the reduced problems to obtain
the solution to the original problem.

4.1. Reduction. In this section, we reduce Problem 1 to a set of finite dimensional
problems. To see this let us first consider the following:

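Problem 2. Let $\mathcal{H}$ be a Hilbert space, $n, m$ positive integers and $\mathcal{A} = \{a_1, \dots, a_m\}$
a set of vectors in $\mathcal{H}$. We want to find a closed subspace $S$ of $\mathcal{H}$ with $\dim(S) \le n$
that satisfies

\[ \sum_{i=1}^{m} \|a_i - P_S a_i\|^2 \le \sum_{i=1}^{m} \|a_i - P_{S'} a_i\|^2 \tag{4.1} \]

for every subspace $S' \subset \mathcal{H}$ with $\dim(S') \le n$.

If such an $S$ exists, we say that $S$ solves Problem 2 for the data $(\mathcal{A}, n)$.

If $\mathcal{B} = \{b_1, \dots, b_r\}$ is a set of vectors from $\mathcal{H}$ with $S = \operatorname{span}(\mathcal{B})$ we will say that
the vectors in $\mathcal{B}$ solve Problem 2 for the data $(\mathcal{A}, n)$. The error for Problem 2 is

\[ \mathfrak{E}(\mathcal{A}, n) = \min_{\dim(S') \le n} \sum_{i=1}^{m} \|a_i - P_{S'} a_i\|^2. \]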

Note that in Problem 2 we take the minimum over all subspaces of dimension
at most $n$, while in Problem 1 the minimization is taken over a particular class of
infinite dimensional subspaces, so the two problems are essentially different.

In the next section we state and prove an extension of the Eckart-Young theorem.
We conclude from this extension that Problem 2 always has a solution for any set of
data $(\mathcal{A}, n)$ in an arbitrary Hilbert space. That is, given $\mathcal{A}$ and $n$ there always exists
a subspace $S$ with $\dim(S) \le n$ satisfying (4.1). We will also see that a solution $S$
can be chosen in such a way that $S \subset \operatorname{span}(\mathcal{A})$ when $n \le \dim(\operatorname{span}(\mathcal{A}))$.

Before proving these results let us see how Problem 2 helps with our original question.

Let $\mathcal{F} = \{f_1, \dots, f_m\} \subset L^2(\mathbb{R}^d)$. We want to find out if there exists $V \in \mathcal{V}_n$
such that $V$ minimizes $\sum_{i=1}^{m} \|f_i - P_V f_i\|^2$. Using Lemma 3.1 we obtain that for
any $V \in \mathcal{V}_n$,

\[ \sum_{i=1}^{m} \|f_i - P_V f_i\|^2 = \int_{[0,1]^d} \sum_{i=1}^{m} \|\tau_\omega f_i - \tau_\omega(P_V f_i)\|_{\ell^2}^2\, d\omega. \tag{4.2} \]

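By Lemma 3.2(iii), $\tau_\omega P_V f_i = P_{V_\omega} \tau_\omega f_i$. So from (4.2) we conclude that

\[ \sum_{i=1}^{m} \|f_i - P_V f_i\|^2 = \int_{[0,1]^d} \sum_{i=1}^{m} \|\tau_\omega f_i - P_{V_\omega} \tau_\omega f_i\|_{\ell^2}^2\, d\omega. \tag{4.3} \]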



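The sum inside the integral on the right-hand side of (4.3) is of the same type as
the sum that is involved in Problem 2 in the case that $\mathcal{H} = \ell^2(\mathbb{Z}^d)$ and $S = V_\omega$.
Since we are assuming that Problem 2 always has a solution, we know that for
almost every $\omega \in [0,1]^d$ there exists a subspace $S_\omega \subset \ell^2(\mathbb{Z}^d)$ that solves Problem
2 for the data $(\mathcal{F}_\omega, n)$ where $\mathcal{F}_\omega = \{\tau_\omega f_1, \dots, \tau_\omega f_m\}$. Note that the subspace $S_\omega$
does not need to be related with the fiber space of any FSIS. If the function $\omega \mapsto
\|\tau_\omega f_i - P_{S_\omega} \tau_\omega f_i\|_{\ell^2}^2$ were a measurable function of $\omega$ then we would have

\[ \int_{[0,1]^d} \sum_{i=1}^{m} \|\tau_\omega f_i - P_{S_\omega} \tau_\omega f_i\|_{\ell^2}^2\, d\omega \le \sum_{i=1}^{m} \|f_i - P_V f_i\|^2 \tag{4.4} \]

for every $V \in \mathcal{V}_n$.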

Therefore, in case that there exists a FSIS $V \in \mathcal{V}_n$ such that $V_\omega = S_\omega$ a.e.
$\omega \in [0,1]^d$, then by Lemmas 3.1 and 3.2 the above function would be measurable
and $V$ necessarily will be a solution to Problem 1 since

\[ \sum_{i=1}^{m} \|f_i - P_V f_i\|^2 = \int_{[0,1]^d} \sum_{i=1}^{m} \|\tau_\omega f_i - P_{S_\omega} \tau_\omega f_i\|_{\ell^2}^2\, d\omega \le \int_{[0,1]^d} \sum_{i=1}^{m} \|\tau_\omega f_i - P_{V'_\omega} \tau_\omega f_i\|_{\ell^2}^2\, d\omega = \sum_{i=1}^{m} \|f_i - P_{V'} f_i\|^2 \tag{4.5} \]

for every $V' \in \mathcal{V}_n$.

We will see later that such a FSIS indeed exists. More precisely we will construct
a set of generators such that its integer translates form a frame of the optimal FSIS.
We will do that by patching together the fibers of the generators of each of the
optimal subspaces $S_\omega$.

4.2. Solution to Problem 2. We now prove that Problem 2 always has a solution.

Theorem 4.1. Let $\mathcal{H}$ be an infinite dimensional Hilbert space, $\mathcal{F} = \{f_1, \dots, f_m\} \subset
\mathcal{H}$, $X = \operatorname{span}\{f_1, \dots, f_m\}$, $\lambda_1 \ge \dots \ge \lambda_m$ the eigenvalues of the matrix $\mathfrak{G}(\mathcal{F})$
defined as in (2.5) and $y_1, \dots, y_m \in \mathbb{C}^m$, with $y_i = (y_{i1}, \dots, y_{im})^t$, orthonormal left
eigenvectors associated to the eigenvalues $\lambda_1, \dots, \lambda_m$. Let $r = \dim X = \operatorname{rank} \mathfrak{G}(\mathcal{F})$.
Define the vectors $q_1, \dots, q_n \in \mathcal{H}$ by

\[ q_i = \sigma_i \sum_{j=1}^{m} y_{ij} f_j, \quad i = 1, \dots, n, \tag{4.6} \]

where $\sigma_i = \lambda_i^{-1/2}$ if $\lambda_i \neq 0$, and $\sigma_i = 0$ otherwise. Then $\{q_1, \dots, q_n\}$ is a Parseval
frame of $W = \operatorname{span}\{q_1, \dots, q_n\}$ and the subspace $W$ is optimal in the sense that

\[ \mathfrak{E}(\mathcal{F}, n) = \sum_{i=1}^{m} \|f_i - P_W f_i\|^2 \le \sum_{i=1}^{m} \|f_i - P_{W'} f_i\|^2, \quad \forall \text{ subspace } W',\ \dim W' \le n. \]

Furthermore we have the following formula for the error

\[ \mathfrak{E}(\mathcal{F}, n) = \sum_{i=n+1}^{m} \lambda_i. \tag{4.7} \]

Remark. If $r$ is small (i.e. $r < n$) then all the vectors $q_{r+1}, \dots, q_n$ are null and
$\{q_1, \dots, q_r\}$ is an orthonormal set.

One could also choose $q_{r+1}, \dots, q_n$ to be any orthonormal set in the orthogonal
complement of $X$ and so obtain an orthonormal set of $n$ elements and the formula
for the error would still hold.
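
The construction (4.6) and the error formula (4.7) can be tried out on a toy case. The sketch below is restricted to two real vectors ($m = 2$, $n = 1$), where the $2 \times 2$ Gram matrix has closed-form eigenvalues and, being real symmetric, its left and right eigenvectors coincide. The helper names `inner`, `optimal_line` and `error`, and the sample data, are assumptions made for illustration; this is not the general algorithm of the paper.

```python
import math


def inner(a, b):
    """Real inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def optimal_line(f1, f2):
    """Return (q1, lam1, lam2) for two real vectors, following (4.6) with n = 1."""
    a, b, d = inner(f1, f1), inner(f1, f2), inner(f2, f2)  # Gram matrix [[a, b], [b, d]]
    mean, half_gap = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    lam1, lam2 = mean + half_gap, mean - half_gap
    if lam1 == 0.0:
        return [0.0] * len(f1), 0.0, 0.0  # both data vectors are zero
    # Eigenvector of the Gram matrix for the largest eigenvalue lam1.
    y = (b, lam1 - a) if b != 0.0 else ((1.0, 0.0) if a >= d else (0.0, 1.0))
    norm = math.hypot(*y)
    y = (y[0] / norm, y[1] / norm)
    sigma = 1.0 / math.sqrt(lam1)  # sigma_1 = lam1^(-1/2)
    q1 = [sigma * (y[0] * u + y[1] * v) for u, v in zip(f1, f2)]
    return q1, lam1, lam2


def error(f_list, q):
    """Sum of squared distances from the data to the line spanned by the unit vector q."""
    return sum(inner(f, f) - inner(f, q) ** 2 for f in f_list)


f1, f2 = (2.0, 0.0, 0.0), (1.0, 1.0, 0.0)
q1, lam1, lam2 = optimal_line(f1, f2)
best_error = error([f1, f2], q1)  # agrees with lam2, as predicted by (4.7)
```

For this data the Gram matrix is $\begin{pmatrix} 4 & 2 \\ 2 & 2 \end{pmatrix}$ with eigenvalues $3 \pm \sqrt{5}$, so the best one-dimensional subspace leaves an error of $3 - \sqrt{5}$.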

If $\mathcal{H}$ is finite dimensional and $n \le r$, then Theorem 4.1 is a consequence of the
Eckart-Young theorem (see Appendix). To prove Theorem 4.1 we will reduce it to
the finite dimensional case and then use the Eckart-Young result.

We first need the following Lemma:

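Lemma 4.2. Let $\mathcal{H}$ be a Hilbert space, $\{f_1, \dots, f_m\} \subset \mathcal{H}$, $X = \operatorname{span}\{f_1, \dots, f_m\}$.
Assume that there exists $M \subset \mathcal{H}$ with $\dim M \le n$ such that

\[ \sum_{i=1}^{m} \|f_i - P_M f_i\|^2 \le \sum_{i=1}^{m} \|f_i - P_{M'} f_i\|^2 \]

for any subspace $M' \subset \mathcal{H}$ with $\dim M' \le n$. Then there exists $W \subset X$, with $\dim W \le
n$, such that

\[ \sum_{i=1}^{m} \|f_i - P_W f_i\|^2 = \sum_{i=1}^{m} \|f_i - P_M f_i\|^2. \]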

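Proof. Define the subspace $W = P_X M$ as the orthogonal projection of $M$ onto $X$.
By construction, $W \subset X$, and $\dim W \le n$.
Let $f \in X$, then we have

\[ \|f - P_W f\|^2 = \inf\{\|f - g\|^2 : g \in W\} \le \|f - P_X P_M f\|^2 = \|P_X f - P_X P_M f\|^2 \le \|f - P_M f\|^2. \]

Applying this inequality to each $f_i \in X$ and summing over $i$ shows that $W$ approximates
the data at least as well as $M$.

□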
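
The steps of the chain in the proof rely on standard facts about orthogonal projections, which can be spelled out as follows (a sketch added for the reader, with $f \in X$ and $W = P_X M$ as in the proof):

\begin{align*}
\|f - P_W f\|^2 = \inf_{g \in W} \|f - g\|^2 &\le \|f - P_X P_M f\|^2 && \text{since } P_X P_M f \in P_X M = W, \\
\|f - P_X P_M f\|^2 &= \|P_X (f - P_M f)\|^2 && \text{since } f \in X \text{ gives } f = P_X f, \\
\|P_X (f - P_M f)\|^2 &\le \|f - P_M f\|^2 && \text{since } \|P_X\| \le 1.
\end{align*}

The bound $\dim W \le \dim M \le n$ holds because $W$ is the image of $M$ under the linear map $P_X$.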

This Lemma shows that, in a possibly infinite dimensional Hilbert space $\mathcal{H}$, the
problem of finding a finite dimensional subspace $M \subset \mathcal{H}$ with $\dim M \le n$ that
"best approximates" $m$ vectors $\{f_1, \dots, f_m\}$ can always be reduced to a search in
the finite dimensional space $X = \operatorname{span}\{f_1, \dots, f_m\}$.

We now prove Theorem 4.1.

Proof (of Theorem 4.1). Let $\tau : X \to \mathbb{C}^r$ be an isometric isomorphism. Set
$b_i = \tau(f_i)$, and let $B$ be the matrix having the vectors $b_i$ as columns. So, $r =
\dim X = \operatorname{rank}(B)$ and $B^* B$ coincides with $\mathfrak{G}(\mathcal{F}) = \{\langle f_i, f_j \rangle_{\mathcal{H}}\}_{i,j}$.

Choose orthonormal left eigenvectors $y_1, \dots, y_m \in \mathbb{C}^m$, with $y_i = (y_{i1}, \dots, y_{im})^t$,
associated to the eigenvalues $\lambda_1 \ge \dots \ge \lambda_m$ of $B^* B$, and define the vectors

\[ u_i = \sigma_i \sum_{j=1}^{m} y_{ij} b_j, \quad i = 1, \dots, n, \tag{4.8} \]

where as before $\sigma_i = \lambda_i^{-1/2}$ if $\lambda_i \neq 0$, and $\sigma_i = 0$ otherwise.

Then, if $n \le r$, by Theorem 4.6 in the Appendix, the subspace $M \subset \mathbb{C}^r$, $M =
\operatorname{span}\{u_1, \dots, u_n\}$ satisfies:

\[ \sum_{i=1}^{m} \|b_i - P_M b_i\|^2 \le \sum_{i=1}^{m} \|b_i - P_{M'} b_i\|^2 \tag{4.9} \]

for every subspace $M' \subset \mathbb{C}^r$ with $\dim M' \le n$.

If however $n > r$, then the left side of (4.9) is $0$ and therefore the inequality is
also satisfied.

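Setting $W = \tau^{-1}(M)$ and noting that $\tau^{-1} P_M \tau = P_W$,
we have from (4.9)

\[ \mathfrak{E}(\mathcal{B}, n) = \sum_{i=1}^{m} \|b_i - P_M b_i\|^2 = \sum_{i=1}^{m} \|f_i - P_W f_i\|^2 \le \sum_{i=1}^{m} \|f_i - P_{W'} f_i\|^2 \tag{4.10} \]

for every subspace $W' \subset \mathcal{H}$ with $\dim W' \le n$, where $\mathcal{B} = \{b_1, \dots, b_m\}$. So, $W \subset \mathcal{H}$ is optimal for $(\mathcal{F}, n)$
and $q_i = \tau^{-1}(u_i)$, $i = 1, \dots, n$, is a Parseval frame for $W$. Furthermore, the error formula
(4.7) also holds. □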

Remarks.

(i) If $n > m$ the optimal space $W$ is not unique since any space $W$ of dimension
$n$ containing $\operatorname{span}\{f_1, \dots, f_m\}$ will be optimal. The same argument also
shows that the space $W$ is not unique if $n > r = \dim X$.

(ii) If $n \le r$, the vectors $u_i$ and $y_i$ are related by $\sqrt{\lambda_i}\, u_i = A y_i$ as described in
the Appendix.

4.3. Solution to Problem 1. In order to solve Problem 1 we need the following
technical lemma concerning the measurability of the eigenvalues and the existence
of measurable eigenvectors of a non-negative matrix with measurable entries
(cf. Lemma 2.3.5].) 

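Lemma 4.3. Let $G = G(\omega)$ be an $m \times m$ self-adjoint matrix of measurable functions
defined on a measurable subset $E \subset \mathbb{R}^d$ with eigenvalues $\lambda_1(\omega) \ge \lambda_2(\omega) \ge \dots \ge
\lambda_m(\omega)$. Then the eigenvalues $\lambda_i$, $i = 1, \dots, m$, are measurable on $E$ and there exists
an $m \times m$ matrix of measurable functions $U = U(\omega)$ on $E$ such that $U(\omega) U^*(\omega) = I$
a.e. $\omega \in E$ and such that

\[ G(\omega) = U(\omega) \Lambda(\omega) U^*(\omega) \quad \text{a.e. } \omega \in E, \tag{4.11} \]

where $\Lambda(\omega) := \operatorname{diag}(\lambda_1(\omega), \dots, \lambda_m(\omega))$.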

Proof (of Theorems 2.1 and 2.3).

In what follows we will apply Theorem 4.1 to find the solution to Problem 1.
As before, let $\mathcal{F} = \{f_1, \dots, f_m\} \subset L^2(\mathbb{R}^d)$ and for $\omega \in [0, 1]^d$ let $G_{\mathcal{F}}(\omega)$ be the
associated Gramian matrix with eigenvalues $\lambda_1(\omega) \ge \dots \ge \lambda_m(\omega) \ge 0$. Let $U(\omega)$
be a measurable $m \times m$ matrix as in Lemma 4.3. Since $G_{\mathcal{F}}(\omega)$ is $\mathbb{Z}^d$-periodic on
$\mathbb{R}^d$, we can choose $U(\omega)$ to be $\mathbb{Z}^d$-periodic as well. Let $U_i(\omega)$ denote the $i$-th column
of $U(\omega)$. Multiplying (4.11) on the left by $U^*(\omega)$ shows that $y_i(\omega) := U_i(\omega)^*$ is a
left eigenvector of $G(\omega)$ with eigenvalue $\lambda_i(\omega)$ for $i = 1, \dots, m$. Furthermore, the
left eigenvectors $y_i(\omega) = (y_{i1}(\omega), \dots, y_{im}(\omega))^t$, $i = 1, \dots, m$, form an orthonormal
basis of $\mathbb{C}^m$.

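For each fixed $\omega \in [0, 1]^d$, we consider Problem 2 in the space $\ell^2(\mathbb{Z}^d)$ for the data
$(\mathcal{F}_\omega, n)$ with $\mathcal{F}_\omega = \{\tau_\omega f_1, \dots, \tau_\omega f_m\}$. Define $q_1(\omega), \dots, q_n(\omega) \in \ell^2(\mathbb{Z}^d)$ by

\[ q_i(\omega) = \sigma_i(\omega) \sum_{j=1}^{m} y_{ij}(\omega)\, \tau_\omega f_j, \quad i = 1, \dots, n, \tag{4.12} \]

where $\sigma_i(\omega) = \lambda_i^{-1/2}(\omega)$ if $\lambda_i(\omega) \neq 0$, and $\sigma_i(\omega) = 0$ otherwise. Since $G_{\mathcal{F}}(\omega) =
\mathfrak{G}(\mathcal{F}_\omega)$ (see Lemma 3.3), Theorem 4.1 shows that the space $S_\omega := \operatorname{span}\{q_1(\omega), \dots, q_n(\omega)\}$
optimizes Problem 2. Moreover, the vectors $\{q_1(\omega), \dots, q_n(\omega)\}$ form a Parseval frame
for $S_\omega$ and we have the following formula for the error

\[ \mathfrak{E}(\mathcal{F}_\omega, n) = \sum_{i=n+1}^{m} \lambda_i(\omega). \tag{4.13} \]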

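Define now the functions $h_i : \mathbb{R}^d \to \mathbb{C}$, $i = 1, \dots, m$,

\[ h_i(\omega) = \sigma_i(\omega) \sum_{j=1}^{m} y_{ij}(\omega)\, \hat{f}_j(\omega). \tag{4.14} \]

Since $\sigma_i$ and $y_i$ are measurable functions of $\omega$, $h_i$ is also measurable. Moreover,
$h_i$ is in $L^2(\mathbb{R}^d)$ as the following simple argument shows. Since

\[ |h_i(\omega)|^2 = h_i(\omega) \overline{h_i(\omega)} = \sigma_i(\omega)^2 \sum_{j,s=1}^{m} y_{ij}(\omega)\, \hat{f}_j(\omega)\, \overline{\hat{f}_s(\omega)}\, \overline{y_{is}(\omega)}, \]

we have (using that if $y_i$ is a left eigenvector of the self-adjoint matrix $G_{\mathcal{F}}$, then $\overline{y_i}$
is a right eigenvector for that matrix associated to the same eigenvalue),

\[ \sum_{k \in \mathbb{Z}^d} |h_i(\omega + k)|^2 = \sigma_i(\omega)^2 \sum_{j=1}^{m} y_{ij}(\omega) \sum_{s=1}^{m} [G_{\mathcal{F}}(\omega)]_{js}\, \overline{y_{is}(\omega)} = \sigma_i(\omega)^2 \lambda_i(\omega) \sum_{j=1}^{m} y_{ij}(\omega) \overline{y_{ij}(\omega)} = \sigma_i(\omega)^2 \lambda_i(\omega). \tag{4.15} \]

If $\lambda_i(\omega) \neq 0$ then the product in (4.15) is one, otherwise it is zero. That is,
$\sum_{k \in \mathbb{Z}^d} |h_i(\omega + k)|^2 = \chi_{\{\omega : \lambda_i(\omega) > 0\}}(\omega)$ and by Lemma 3.1, $\|h_i\| \le 1$.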
Now define functions $\varphi_1, \dots, \varphi_n$ in $L^2(\mathbb{R}^d)$ by:

\[ \hat{\varphi}_i(\omega) = h_i(\omega), \quad i = 1, \dots, n, \]

and let $V = S(\varphi_1, \dots, \varphi_n)$. The space $V$ is a shift-invariant space of length no
bigger than $n$. So $V \in \mathcal{V}_n$. Furthermore, by Lemma 3.2 (iv-a), the space $V_\omega$ is
spanned by $\tau_\omega \varphi_i$, $i = 1, \dots, n$.

Since $(\tau_\omega \varphi_i)(k) = h_i(\omega + k) = q_i(\omega)(k)$, $k \in \mathbb{Z}^d$, $i = 1, \dots, n$ a.e., then $V_\omega = S_\omega$
(the optimal space for the data $(\mathcal{F}_\omega, n)$) in $\ell^2(\mathbb{Z}^d)$.

By equation (4.5) and the comment before it, $V$ is optimal, that is, $V$ solves Problem
1 for the data $(\mathcal{F}, n)$.

Now, since $\{\tau_\omega \varphi_1, \dots, \tau_\omega \varphi_n\}$ is a Parseval frame of $S_\omega$ for a.e. $\omega \in [0, 1]^d$, then
by Lemma 3.2 (iv-b) the integer translates of $\varphi_1, \dots, \varphi_n$ form a Parseval frame of $V$.
On the other hand, Formula (4.3) says that

\[ \mathcal{E}(\mathcal{F}, n) = \int_{[0,1]^d} \mathfrak{E}(\mathcal{F}_\omega, n)\, d\omega. \tag{4.16} \]

Thus using (4.13) we have that $\mathcal{E}(\mathcal{F}, n) = \sum_{i=n+1}^{m} \int_{[0,1]^d} \lambda_i(\omega)\, d\omega$. □

Proof (of Theorem 2.4).

Under the hypothesis of Theorem 2.4, Theorem 4.1 and the Remark after it
guarantee the uniqueness of the optimal spaces $S_\omega$ associated to the data $(\mathcal{F}_\omega, n)$
for almost all $\omega$. Since these fiber spaces characterize the optimal space $V$, the
theorem follows. □



Appendix 

4.4. Best linear approximation and the SVD. Here we review the singular
value decomposition (SVD) of a matrix and its relation to finite dimensional
least-squares problems. For an overview see [Ste93], and for a very detailed treatment
see for example [HJ85].

We start with the following proposition. 

Proposition 4.4 (SVD). Let $A = [a_1, a_2, \dots, a_m]$ be the matrix with columns
$a_i \in \mathbb{C}^N$, $m \le N$. Let $r := \dim \operatorname{span}\{a_1, \dots, a_m\}$. Then there are $m$ numbers
$\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_r > \lambda_{r+1} = \dots = \lambda_m = 0$, an orthonormal collection of $m$ (column)
vectors $y_1, \dots, y_m \in \mathbb{C}^m$, and an orthonormal collection of $m$ (column) vectors
$u_1, \dots, u_m \in \mathbb{C}^N$ such that

\[ A = \sum_{k=1}^{m} \lambda_k^{1/2} u_k y_k^* = U \Lambda^{1/2} Y^*, \tag{4.17} \]

where $U \in \mathbb{C}^{N \times m}$ is the matrix $U = [u_1, \dots, u_m]$, $\Lambda^{1/2} = \operatorname{diag}(\lambda_1^{1/2}, \dots, \lambda_m^{1/2})$, and
$Y = [y_1, \dots, y_m] \in \mathbb{C}^{m \times m}$ with $U^* U = I_m = Y^* Y = Y Y^*$.

The representation of $A$ given in (4.17) is called the singular value decomposition
(SVD) of $A$.

The SVD of a matrix $A$ can be obtained as follows. Consider the matrix $A^* A \in
\mathbb{C}^{m \times m}$. Since $A^* A$ is self-adjoint and positive semi-definite, its eigenvalues $\lambda_1 \ge
\lambda_2 \ge \dots \ge \lambda_m$ are nonnegative and the associated eigenvectors $y_1, \dots, y_m$ can be
chosen to form an orthonormal basis of $\mathbb{C}^m$. Note that the rank $r$ of $A$ corresponds
to the largest index $i$ such that $\lambda_i > 0$. The left singular vectors $u_1, \dots, u_r$ can
then be obtained from

\[ \sqrt{\lambda_i}\, u_i = A y_i, \quad \text{that is} \quad u_i = \lambda_i^{-1/2} \sum_{j=1}^{m} y_{ij} a_j \quad (1 \le i \le r). \]

Here $y_i = (y_{i1}, \dots, y_{im})^t$. The remaining left singular vectors $u_{r+1}, \dots, u_m$ can be
chosen to be any orthonormal collection of $m - r$ vectors in $\mathbb{C}^N$ that are perpendicular
to $\operatorname{span}\{a_1, \dots, a_m\}$. One may then readily verify that (4.17) holds.

The Frobenius norm of a matrix $X = [x_1, \dots, x_m] \in \mathbb{C}^{N \times m}$ is given by $\|X\|_F^2 = \operatorname{tr}(X^* X)$,
where $\operatorname{tr}$ denotes the trace of a matrix.

Now, the following approximation theorem of Schmidt (cf. [Sch07]), later
rediscovered by Eckart and Young ([EY36]), shows that the SVD can be used to
find the subspace of dimension $n$ that is closest to a given finite number of vectors.

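Theorem 4.5. Let $\{a_1, \dots, a_m\}$ be a set of vectors in $\mathbb{C}^N$ such that $r = \dim
(\operatorname{span}\{a_1, \dots, a_m\})$, and suppose $A = [a_1, \dots, a_m]$ has SVD $A = U \Lambda^{1/2} Y^*$ with
$0 < n \le r$. Then $A_n := \sum_{j=1}^{n} \lambda_j^{1/2} u_j y_j^*$ satisfies

\[ \|A - A_n\|_F = \min_{\operatorname{rank}(B) \le n} \|A - B\|_F = \Big( \sum_{j=n+1}^{m} \lambda_j \Big)^{1/2}. \]

If $\lambda_{n+1} \neq \lambda_n$, then $A_n$ is the unique such matrix of rank at most $n$.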
Equivalently, 

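Theorem 4.6. Let $\{a_1, \dots, a_m\}$ be a set of vectors in $\mathbb{C}^N$ such that $r = \dim
(\operatorname{span}\{a_1, \dots, a_m\})$, and suppose $A = [a_1, \dots, a_m]$ has SVD $A = U \Lambda^{1/2} Y^*$ and
that $0 < n \le r$. If $W = \operatorname{span}\{u_1, \dots, u_n\}$, then

\[ [P_W a_1, \dots, P_W a_m] = \sum_{j=1}^{n} \sqrt{\lambda_j}\, u_j y_j^* = A_n, \]

\[ \sum_{i=1}^{m} \|a_i - P_W a_i\|^2 \le \sum_{i=1}^{m} \|a_i - P_M a_i\|^2 \quad \forall\, M,\ \dim M \le n, \tag{4.18} \]

and the space $W$ is unique if $\lambda_{n+1} \neq \lambda_n$.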
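
A small worked instance may help fix the notation of Theorems 4.5 and 4.6; the matrix below is chosen here for illustration and is not taken from the paper. Let $N = m = 2$, $n = 1$ and

\[ A = [a_1, a_2] = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}, \qquad A^* A = \begin{pmatrix} 9 & 0 \\ 0 & 1 \end{pmatrix}. \]

Then $\lambda_1 = 9$, $\lambda_2 = 1$, $y_1 = e_1$, $y_2 = e_2$, and $u_i = \lambda_i^{-1/2} A y_i = e_i$. Hence

\[ A_1 = \sqrt{\lambda_1}\, u_1 y_1^* = \begin{pmatrix} 3 & 0 \\ 0 & 0 \end{pmatrix}, \qquad W = \operatorname{span}\{e_1\}, \qquad \sum_{i=1}^{2} \|a_i - P_W a_i\|^2 = 0 + 1 = \lambda_2 = \|A - A_1\|_F^2, \]

and since $\lambda_2 \neq \lambda_1$ the optimal line $W$ is unique.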